The Development of the 1997 CMU Spanish Broadcast News Transcription System

Authors

  • Juan M. Huerta
  • Eric Thayer
  • Mosur Ravishankar
  • Richard M. Stern
Abstract

This paper describes the 1997 CMU DARPA Hub 4 Spanish Broadcast News Transcription system. The system is based on the CMU SPHINX-III recognizer and uses a single set of acoustic and language models. Decoding is performed in two passes. In the first recognition stage, a Viterbi search and a directed acyclic graph (DAG) search are performed. The second recognition stage is similar to the first, except that it uses models adapted through maximum-likelihood linear regression (MLLR). We describe the issues relating to the design and development of the acoustic models, language models, and lexicon. Developmental results and an analysis are presented.


Similar resources

Broadcast News Transcription System

This paper describes the 1998 CMU Hub 4 Spanish broadcast news transcription system. We focus on the development and improvements of the system with respect to the 1997 system. Both the 1997 and 1998 systems were developed using exactly the same acoustic and language model training material, thus the improvements obtained resulted from a better utilization and modeling of these corpora and a be...


The 1997 CMU Sphinx-3 English Broadcast News Transcription System

This paper describes the 1997 Hub-4 Broadcast News Sphinx-3 speech recognition system. This year’s system includes full-bandwidth acoustic models trained on Broadcast News and Wall Street Journal acoustic training data, an expanded vocabulary, and a 4-gram language model for N-best list rescoring. The system structure, acoustic and language models, and adaptation components are described in detai...


The need to create a media block for the convergence of overseas news networks

As a general diplomacy arm of the Islamic Republic of Iran, VoSiMa has extensive activities in international broadcasting of its radio and television programs. These programs are broadcast in different languages, such as English, French, Azeri, Arabic, and ... for regional and transnational audiences. The large volume of the organization's international activities is in the form of news and new...


Spanish broadcast news transcription

We describe the Sail Labs Media Mining System (MMS) aimed at the transcription of Castilian Spanish broadcast news. In contrast to previous systems, the focus of this system is on Spanish as spoken on the Iberian Peninsula as opposed to the Americas. We discuss the development of a Castilian Spanish broadcast-news corpus suitable for training the various system components of the MMS and report o...


The 1997 BBN Byblos System Applied to Broadcast News Transcription

In this paper, we describe the BBN Byblos system used for the 1997 DARPA Hub-4 Broadcast News evaluation and discuss numerous improvements made to the system in 1997. We focused our effort entirely upon the two conditions containing studio-quality uncorrupted speech from native speakers, the so-called F0 (prepared speech) and F1 (spontaneous speech) conditions. In particular, we did not bother t...




Publication date: 1998